Method and system for behavior query construction in temporal graphs using discriminative sub-trace mining

ABSTRACT

A method and system for constructing behavior queries in temporal graphs using discriminative sub-trace mining. The method includes generating system data logs to provide temporal graphs, wherein the temporal graphs include a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors, generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern, pruning the pattern between the first and second temporal graph patterns to provide a discriminative temporal graph, and generating behavior queries based on the discriminative temporal graph.

RELATED APPLICATION INFORMATION

This application claims priority to provisional application Ser. No. 62/075,478 filed on Nov. 5, 2014, incorporated herein by reference.

BACKGROUND

1. Technical Field

The present invention generally relates to methods and systems for behavior query construction in temporal graphs. More particularly, the present disclosure is related to methods and systems for behavior query construction in temporal graphs using discriminative sub-trace mining.

2. Description of the Related Art

Because computer systems are widely deployed to manage businesses, ensuring the proper functioning of computer systems is an important aspect of business execution. For example, if a system is compromised and/or encounters system failures, the security of the system cannot be guaranteed and/or the services hosted in the system may be interrupted. However, maintaining the proper functioning of computer systems is a challenging task, since system administrators have limited visibility into these complex systems.

Generally, it is difficult for system administrators to cope with vulnerabilities to computer systems, such as key-loggers, spyware, malware, etc., without monitoring and understanding system behaviors. System behaviors may include a set of information generated from when a system entity, such as a program, is executed to when the system entity is terminated, which is generally referred to as a path and/or execution trace. Execution traces of how system entities (e.g., processes, files, sockets, pipes, etc.) interact with each other at the operating system level may be collected when monitoring security-related behaviors.

However, monitoring a computer system generates huge amounts of data, typically stored in application logs that record all of the interactions among the system entities over time. For example, the logs include a sequence of events, each of which describes what kind of interaction happened between which system entities and at what time. Existing solutions require administrators to search among the application logs, which can be inefficient and ineffective, since some application logs (e.g., file access logs, firewall logs, network monitoring logs, etc.) provide only partial information about system behaviors.

Thus, better understanding of system behaviors and identification of potential system risks and malicious behaviors becomes a challenging task for system administrators due to the dynamics and heterogeneity of the system data.

SUMMARY

In one embodiment of the present principles, a method for behavior query construction in temporal graphs using discriminative sub-trace mining is provided. In an embodiment, the method may include generating system data logs to provide temporal graphs, wherein the temporal graphs include a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors, generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern, pruning the pattern between the first and second temporal graph patterns to provide a discriminative temporal graph, and generating behavior queries based on the discriminative temporal graph.

In another embodiment, a system for behavior query construction in temporal graphs using discriminative sub-trace mining is provided. In an embodiment, the system may include a monitoring device to generate system data logs to provide temporal graphs, wherein the temporal graphs include at least a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors, a temporal graph pattern generator to generate temporal graph patterns for each of the first and second temporal graphs, a pattern determiner to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern, a pattern pruner, coupled to a bus, to prune the pattern between the first and second temporal graph patterns to provide at least one discriminative temporal graph, and a behavior query generator, coupled to the bus, to generate behavior queries based on the at least one discriminative temporal graph.

In yet another aspect of the present disclosure, a computer program product is provided that includes a computer readable storage medium having computer readable program code embodied therein for performing a method for behavior query construction in temporal graphs using discriminative sub-trace mining. In an embodiment, the method may include generating system data logs to provide temporal graphs, wherein the temporal graphs include a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors, generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern, pruning the pattern between the first and second temporal graph patterns to provide a discriminative temporal graph, and generating behavior queries based on the discriminative temporal graph.

These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

The present principles will provide details in the following description of preferred embodiments with reference to the following figures wherein:

FIG. 1 is a block/flow diagram illustratively depicting an exemplary system/method for constructing behavior queries in temporal graphs using discriminative sub-trace mining, in accordance with an embodiment of the present principles;

FIG. 2 shows an illustrative example of temporal graphs, in accordance with an embodiment of the present principles;

FIG. 3 shows an exemplary growth pattern, in accordance with an embodiment of the present principles;

FIG. 4A shows an exemplary growth pattern, in accordance with an embodiment of the present principles;

FIG. 4B shows an exemplary growth pattern, in accordance with an embodiment of the present principles;

FIG. 4C shows an exemplary growth pattern, in accordance with an embodiment of the present principles;

FIG. 5 shows an exemplary residual graph, in accordance with an embodiment of the present principles;

FIG. 6 is a block/flow diagram illustratively depicting an exemplary system/method for pruning a pattern between temporal graph patterns, in accordance with an embodiment of the present principles;

FIG. 7 is a block/flow diagram illustratively depicting an exemplary system/method for pruning a pattern between temporal graph patterns, in accordance with an embodiment of the present principles;

FIG. 8 is an illustrative example of a sequence-based representation between temporal graph patterns, in accordance with the present principles;

FIG. 9 shows an exemplary processing system/method to which the present principles may be applied, in accordance with an embodiment of the present principles; and

FIG. 10 shows an exemplary processing system/method for constructing behavior queries in temporal graphs using discriminative sub-trace mining, in accordance with an embodiment of the present principles.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Methods and systems for behavior query construction in temporal graphs using discriminative sub-trace mining are provided. One challenge in monitoring and understanding system behaviors in computer systems to identify potential system risks using behavior queries is the heterogeneity and overall amount of the system data. According to one aspect of the present principles, the methods, systems and computer program products disclosed herein apply discriminative sub-trace mining to temporal graphs to mine discriminative sub-traces as graph patterns of security-related behaviors and construct behavior queries that are mapped to user-understandable semantic meanings and are effective for searching the execution traces. Security-related behaviors may include, but are not limited to, file compression/decompression, source code compilation, file download/upload, remote login, and system software management (e.g., installation and/or update of software applications). In addition, the instant methods and systems prune graph patterns that share similar growth trends, thereby significantly reducing computation time and increasing data storage efficiency, since repetitive searches are avoided and/or redundant searches are pruned without compromising pattern quality.

To ensure the security of a computer system enterprise, a system administrator may query system data logs to determine if a particular security behavior has occurred, such as activity over a weekend, when activity on the system is typically fairly limited. For illustrative purposes, activities may include remote access to the system, compression of several files, and/or transfer of the files to a remote server. Generally, the system administrator may be required to submit three separate queries (e.g., remote access login, compression of files, and transfer to remote server) and perform a search over the entire system data log to find a security-related activity. In some instances, it may be difficult for system administrators to directly query such monitoring data, represented as temporal graphs, for security-related behaviors, referred to as behavior queries, since temporal graphs are complex with many tedious low-level entities (e.g., processes, files, etc.) recorded in the system data logs that cannot be directly mapped to any high-level activity (e.g., remote access login, compression of files, and transfer to remote server). In such instances, a semantic gap exists between such system-level interactions and the security-related behaviors of interest. To locate high-level activities, a system administrator must know which processes or files are involved in the high-level activity and in what order over time the low-level entities are involved in the high-level activity in order to write a query. However, due to the complexity of such temporal graphs, it becomes time-consuming for system administrators to manually formulate useful queries in order to examine abnormal activities, attacks, and vulnerabilities in computer systems.

To overcome this problem, the present principles teach identifying the most discriminative patterns for target behaviors in temporal graphs and employing the most discriminative patterns as behavior queries. Accordingly, these behavior queries, which may consist of only a few edges, are easier to interpret and modify as well as being robust to noise. In accordance with one embodiment, a positive set and a negative set of temporal graphs may be determined, and temporal graph patterns with maximum discriminative score may be identified, as will be described in further detail below. Accordingly, a discriminative pattern should frequently occur in target behaviors and rarely exist in other behaviors.

Referring to the drawings in which like numerals represent the same or similar elements and initially to FIG. 1, a block/flow diagram illustratively depicting exemplary methods/systems 100 for constructing behavior queries in temporal graphs using discriminative sub-trace mining according to one embodiment of the present principles is shown.

Generally, pattern mining may characterize large and complex data sets into concise forms. Discriminative graph pattern mining is a feature selection method that may be applied in graph classification tasks to distinguish characteristics and identify differences between data sets. Specifically, discriminative pattern mining is a technique concerned with identifying a set of patterns and the frequency of those patterns that occur in data sets. According to one embodiment, discriminative pattern mining on temporal graphs may be implemented to identify patterns related to security-related behaviors in computer systems.

In block 102, the method 100 may include monitoring system data (e.g., execution of behavior traces at a computer system) and generating system data logs. System data logs, which may include raw system behaviors, target behaviors and/or background behaviors, may be collected and may be employed as input data. The system data logs may include information relating to how system entities interact with each other at the operating system level (e.g., execution and/or behavior traces) and may include timestamps. In some embodiments, processes may be monitored and/or collected along with any corresponding files and/or timestamps. The processes, files and/or timestamps may be collected to generate a system data log and may be used to generate corresponding temporal graphs.

In one embodiment, the system data logs may be generated in a closed environment where only one target behavior is performed. For example, the system data logs include a target behavior that is independently run without other behaviors (e.g., background behaviors) running concurrently. In addition, the system data logs may include background behaviors independently run without the target behavior running concurrently.

In one embodiment, the system data logs may be modeled and/or be provided as temporal graphs corresponding to the system data logs, with nodes being system entities and edges being their interactions with timestamps. In an embodiment, the temporal graphs may include at least a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors, as shown in block 102. Accordingly, the system data of a target behavior may generate a temporal graph of no more than a few thousand nodes and/or edges. In addition, the system data of a set of background behaviors may generate a temporal graph comprising nodes and/or edges.

Temporal graphs are a graph representation of a set of objects, referred to as nodes, where some pairs of objects are connected by links, referred to as edges. Generally, a temporal graph G is represented by a tuple (V,E,A,T), where V is a set of nodes, E⊂V×V×T is a set of directed edges that are totally ordered by their timestamps, A:V→Σ is a function that assigns labels to nodes (Σ is a set of node labels), and T is a set of possible timestamps, which are non-negative integers on edges. In some embodiments, the method employs temporal graphs with total edge order. In temporal graphs, edges may have timestamps. Therefore, edges may be ranked and/or ordered by the timestamps. If edges have a total order, then for any two distinct edges e₁ and e₂, either e₁'s timestamp is smaller than e₂'s timestamp, or e₁'s timestamp is greater than e₂'s timestamp. In other words, when temporal graphs include total edge order, no two edges share an identical timestamp. It should be noted that the present principles may be applied to temporal graphs with multi-edges, node labels and edge timestamps, as well as edge labels.
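For illustration only, the tuple (V,E,A,T) described above might be held in memory as in the following minimal Python sketch. The class name `TemporalGraph`, its method names, and the node labels in the example are hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalGraph:
    """Sketch of a temporal graph (V, E, A, T).

    labels plays the role of A (node -> label); its keys are V.
    edges is E, a list of directed (u, v, t) triples; the t values form T.
    """
    labels: dict
    edges: list = field(default_factory=list)

    def add_edge(self, u, v, t):
        # Both endpoints must already be labeled nodes.
        if u not in self.labels or v not in self.labels:
            raise KeyError("both endpoints need a node label")
        # Timestamps are non-negative integers.
        if not isinstance(t, int) or t < 0:
            raise ValueError("timestamps are non-negative integers")
        self.edges.append((u, v, t))

    def has_total_edge_order(self):
        # Total edge order holds when no two edges share a timestamp.
        stamps = [t for _, _, t in self.edges]
        return len(stamps) == len(set(stamps))

    def edges_by_time(self):
        # Edges ranked by timestamp, earliest first.
        return sorted(self.edges, key=lambda e: e[2])


# Hypothetical example: three system entities and two interactions.
example = TemporalGraph(labels={1: "bash", 2: "gzip", 3: "file"})
example.add_edge(2, 3, 17)
example.add_edge(1, 2, 10)
```

In this sketch, multi-edges between the same pair of nodes are allowed as long as their timestamps differ.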

In an embodiment, the system data logs for a target behavior may include a set of positive temporal graphs and the system data logs for background behaviors may include a set of negative temporal graphs. For example, in block 102, the system data logs that include a target behavior may be treated as a set of positive temporal graphs, G_(p), and the system data logs that include background behaviors may be treated as a set of negative temporal graphs, G_(n). It should be noted that system data logs for normal and/or abnormal behaviors (e.g., intrusion behaviors) may be used as positive datasets, which may be employed to generate graph pattern queries for normal and/or abnormal behaviors.

In a further embodiment, the temporal graphs may include temporal subgraphs. Accordingly, the temporal subgraphs may include at least a first temporal subgraph corresponding to a target behavior and a second temporal subgraph corresponding to a set of background behaviors, as shown in block 102. For example, in some embodiments, it may be advantageous and efficient to use discriminative subgraphs (hereinafter “subgraph”) of the temporal graphs to capture the footprint of a target behavior instead of employing the entire raw temporal graph from the system data logs as a behavior query.

Given two temporal graphs, namely G=(V,E,A,T) and G′=(V′,E′,A′,T′), temporal graph G is a subgraph of G′ (denoted G ⊂_(t) G′) if and only if there exist two injective functions, f:V→V′ and τ:T→T′, such that node mapping, edge mapping, and edge order are preserved. Node mapping may be defined as ∀u∈V, A(u)=A′(f(u)), where V is the set of nodes in temporal graph G, u is a node in temporal graph G, and f(u) is the node in G′ to which u maps, such that u and f(u) share an identical node label. Edge mapping may be defined as ∀(u,v,t)∈E, (f(u),f(v),τ(t))∈E′, where E is the set of edges in temporal graph G, (u,v,t) is an edge in G between node u and node v with timestamp t, E′ is the set of edges in G′, and (f(u),f(v),τ(t)) is an edge in G′ between node f(u) and node f(v) with timestamp τ(t). Accordingly, (u,v,t) maps to (f(u),f(v),τ(t)), where node u, node v, and timestamp t in temporal graph G map to node f(u), node f(v), and timestamp τ(t) in graph G′, respectively. Edge order may be defined as ∀(u₁,v₁,t₁),(u₂,v₂,t₂)∈E, sign(t₁−t₂)=sign(τ(t₁)−τ(t₂)), such that timestamps t₁ and t₂ in G map to timestamps τ(t₁) and τ(t₂) in G′, respectively. Thus, sign(t₁−t₂)=sign(τ(t₁)−τ(t₂)) means (1) if t₁ is smaller than t₂ (e.g., the sign of t₁−t₂ is negative), then τ(t₁) is smaller than τ(t₂) (e.g., the sign of τ(t₁)−τ(t₂) is negative); and (2) if t₁ is greater than t₂ (e.g., the sign of t₁−t₂ is positive), then τ(t₁) is greater than τ(t₂) (e.g., the sign of τ(t₁)−τ(t₂) is positive). Temporal graph G′ is a match of temporal graph G, which may be denoted as G′=_(t)G, when f and τ are bijective functions, where every element of one set is paired with one element of the other set, and every element of the other set is paired with one element of the first set such that there are no unpaired elements. An illustrative example of temporal subgraphs is shown in FIG. 2, which will be described in further detail below.
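The three conditions above (node mapping, edge mapping, and edge order) can be checked with a simple backtracking search, sketched below in Python for illustration. The function name `is_temporal_subgraph` and the `(labels, edges)` input layout are assumptions, the search is a brute-force illustration rather than the mining procedure of the present principles, and the sketch assumes total edge order and that every node of the smaller graph touches at least one edge.

```python
def is_temporal_subgraph(small, big):
    """Return True if `small` is a temporal subgraph of `big`.

    Each graph is a pair (labels, edges): labels maps node -> label,
    edges is a list of (u, v, t) triples with distinct timestamps.
    """
    s_labels, s_edges = small
    b_labels, b_edges = big
    s_edges = sorted(s_edges, key=lambda e: e[2])
    b_edges = sorted(b_edges, key=lambda e: e[2])

    def extend(i, start, f, used):
        # f is the partial node mapping; used holds nodes of `big` already taken.
        if i == len(s_edges):
            return True
        u, v, _ = s_edges[i]
        # Only later edges of `big` are tried, which preserves edge order.
        for j in range(start, len(b_edges)):
            x, y, _ = b_edges[j]
            if s_labels[u] != b_labels[x] or s_labels[v] != b_labels[y]:
                continue  # node labels must be identical
            new, ok = {}, True
            for a, b in ((u, x), (v, y)):
                cur = f.get(a, new.get(a))
                if cur is None:
                    if b in used or b in new.values():
                        ok = False  # would break injectivity
                        break
                    new[a] = b
                elif cur != b:
                    ok = False  # conflicts with the mapping so far
                    break
            if not ok:
                continue
            f.update(new)
            used.update(new.values())
            if extend(i + 1, j + 1, f, used):
                return True
            for a, b in new.items():  # undo and try the next edge
                del f[a]
                used.discard(b)
        return False

    return extend(0, 0, {}, set())


big_graph = ({1: "A", 2: "B", 3: "C"}, [(1, 2, 5), (2, 3, 9), (1, 3, 12)])
in_order = ({"a": "A", "b": "B", "c": "C"}, [("a", "b", 1), ("b", "c", 2)])
reversed_order = ({"a": "A", "b": "B", "c": "C"}, [("b", "c", 1), ("a", "b", 2)])
```

Here `in_order` is contained in `big_graph`, while `reversed_order` is not, because the B-to-C edge of `big_graph` occurs after its A-to-B edge.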

In block 104, the method may include generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between the first and second temporal graph patterns. In one embodiment, the pattern between the first and second temporal graph patterns is a non-repetitive graph pattern, as will be described in further detail below. A temporal graph pattern g=(V,E,A,T) is a temporal graph in which every edge timestamp is between one (1) and the total number of edges in the temporal graph, such that ∀t∈T, 1≤t≤|E|. Unlike general temporal graphs, where timestamps could be arbitrary non-negative integers, timestamps in temporal graph patterns are aligned (e.g., from 1 to |E|) and only total edge order is kept.
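As a small illustration of timestamp alignment, the following Python sketch rewrites arbitrary distinct timestamps as 1 through |E| while keeping only the edge order. The function name `to_pattern` is hypothetical.

```python
def to_pattern(edges):
    """Align timestamps of (u, v, t) edges to 1..|E|, keeping only their order."""
    ordered = sorted(edges, key=lambda e: e[2])
    return [(u, v, i) for i, (u, v, _) in enumerate(ordered, start=1)]


aligned = to_pattern([(1, 2, 40), (2, 3, 7), (1, 3, 19)])
# aligned == [(2, 3, 1), (1, 3, 2), (1, 2, 3)]
```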

In an embodiment, the temporal graph patterns, such as the temporal graph patterns for each of the first and second temporal graphs, may be T-connected graph patterns. Temporal graphs may be divided into T-connected temporal graphs and non T-connected temporal graphs according to how their edges connect over time. A temporal graph G=(V,E,A,T), where V is the set of nodes in G, E is the set of edges in G, A is a function that assigns labels to nodes in G, and T is the set of timestamps on edges in G, is defined as T-connected if, ∀(u,v,t)∈E, the edges whose timestamps are smaller than t form a connected graph. Here, (u,v,t) is an edge in G between node u and node v with timestamp t. An illustrative example of T-connected temporal graphs and non T-connected temporal graphs is shown in FIG. 2, which will be described in further detail below.
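The T-connected property can be tested in a single pass over the edges in time order, as in the following hedged Python sketch. The function name `is_t_connected` is hypothetical. As a labeled assumption, the sketch also requires the prefix ending at each edge itself (and therefore the whole graph) to be connected, which is slightly stricter than the wording above.

```python
def is_t_connected(edges):
    """Check that every time-ordered prefix of (u, v, t) edges is connected.

    Assumption: the prefix that ends at each edge is checked too, so the
    full edge set must be connected as well.
    """
    seen = set()
    for u, v, _ in sorted(edges, key=lambda e: e[2]):
        # A later edge must touch a node already reached by earlier edges;
        # otherwise the prefix ending here splits into two components.
        if seen and u not in seen and v not in seen:
            return False
        seen.update((u, v))
    return True


connected_over_time = [(1, 2, 1), (2, 3, 2), (3, 1, 3)]
joined_too_late = [(1, 2, 1), (3, 4, 2), (2, 3, 3)]
```

The second example is connected as an ordinary graph, but it is not T-connected because the edge at timestamp 2 is detached from everything that came before it.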

With continued reference to FIG. 1, the method includes determining if a pattern is formed between the temporal graph patterns, as shown in block 104. In an embodiment, a determination is made whether or not a pattern exists between a first temporal graph pattern and a second temporal graph pattern corresponding to the first and second temporal graphs, respectively. In a preferred embodiment, the pattern is a non-repetitive graph pattern.

In one embodiment, a pattern is determined when each edge in a first temporal graph pattern corresponds to an edge in a second temporal graph pattern such that the node mappings between the edges are one-to-one. For example, assuming a first temporal graph pattern g₁=(V₁,E₁,A₁,T₁) and a second temporal graph pattern g₂=(V₂,E₂,A₂,T₂), where |V₁|=|V₂| and the total number of edges in the first temporal graph pattern is equal to the total number of edges in the second temporal graph pattern, such that |E₁|=|E₂|, a linear scan may be conducted over the edges in g₁. For each edge (u₁,v₁,t)∈E₁ in the first temporal graph pattern, an edge with the same timestamp is located in the second temporal graph pattern, such as the edge (u₂,v₂,t)∈E₂. If such an edge exists, the mapping from u₁ to u₂ and the mapping from v₁ to v₂ are verified to ensure that such mappings are one-to-one. If both are, then (u₁,v₁,t) matches (u₂,v₂,t)∈E₂. Accordingly, a pattern between the first temporal graph pattern and the second temporal graph pattern exists (e.g., g₁=_(t)g₂) when all the edges in g₁ find their matches in g₂. In that case, two bijective functions, f:V₁→V₂ and τ:T₁→T₂, are found: because the linear scan follows the unique way to match edge timestamps between g₁ and g₂ and |E₁|=|E₂|, τ is found and is bijective. Accordingly, the present principles guarantee that the node mapping f is one-to-one and, moreover, a full mapping of f is generated because |E₁|=|E₂| and all the nodes in g₁ and g₂ are mapped.
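The linear scan described above might be sketched as follows in Python. The function name `patterns_match` and the argument layout are hypothetical; the sketch assumes both inputs are temporal graph patterns with timestamps aligned to 1..|E| and that every node touches at least one edge. The node label comparison follows the node mapping condition defined earlier for temporal subgraphs.

```python
def patterns_match(labels1, edges1, labels2, edges2):
    """Linear scan testing whether two temporal graph patterns are identical.

    labels1/labels2 map node -> label; edges1/edges2 are lists of (u, v, t).
    """
    if len(labels1) != len(labels2) or len(edges1) != len(edges2):
        return False
    # Index the second pattern by timestamp: the only candidate match for
    # an edge with timestamp t is the edge of g2 that also has timestamp t.
    by_time2 = {t: (u, v) for u, v, t in edges2}
    fwd, bwd = {}, {}  # node mapping f and its inverse
    for u1, v1, t in edges1:
        if t not in by_time2:
            return False
        u2, v2 = by_time2[t]
        for a, b in ((u1, u2), (v1, v2)):
            if labels1[a] != labels2[b]:
                return False
            # setdefault records a new pair or returns the existing one,
            # so any disagreement means the mapping is not one-to-one.
            if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
                return False
    return True


chain_labels = {0: "A", 1: "B", 2: "C"}
chain_edges = [(0, 1, 1), (1, 2, 2)]
other_labels = {"x": "A", "y": "B", "z": "C"}
same_shape = [("y", "z", 2), ("x", "y", 1)]
different_shape = [("x", "y", 1), ("x", "z", 2)]
```

Each edge is visited once and each dictionary operation takes expected constant time, which is consistent with a scan that is linear in the number of edges.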

In one embodiment, whether two temporal graph patterns are identical may be determined in linear time. It should be noted that pattern growth is more efficient in temporal graphs compared with non-temporal graphs. For example, the computation advantages of temporal graphs originate from the following property. Assuming that g₁ and g₂ are temporal graph patterns, if g₁=_(t)g₂, the mappings f and τ between them are unique. This is referred to herein as Lemma 1. It may be assumed that g₁=(V₁,E₁,A₁,T₁) and g₂=(V₂,E₂,A₂,T₂). Since g₁ and g₂ are temporal graph patterns, we have ∀(u₁,v₁,t₁)∈E₁, 1≤t₁≤|E₁| and ∀(u₂,v₂,t₂)∈E₂, 1≤t₂≤|E₂|. Because g₁=_(t)g₂ and |E₁|=|E₂|, (u₁,v₁,t₁)∈E₁ matches (u₂,v₂,t₂)∈E₂ only if t₁=t₂ in order to preserve total edge order. Thus, the uniqueness of τ is proved such that τ:T₁→T₂. Since τ is unique, the edge mapping between g₁ and g₂ is unique, and therefore the node mapping f is also unique such that f:V₁→V₂.

In addition, it is costly to conduct pattern growth for non-temporal graphs. A non-temporal pattern may be grown into a specific larger pattern in many different ways. Therefore, in order to avoid repeated computation, additional computations are needed to confirm whether one pattern is a new pattern or is an already discovered one. Accordingly, this results in high computation cost, as graph isomorphism is inevitably involved. To reduce the overhead, various canonical labeling techniques along with their sophisticated pattern growth algorithms have been proposed, but the cost is still very high because of the intrinsic complexity in graph isomorphism. Unlike mining non-temporal graphs, the present principles avoid repeated pattern search without using any sophisticated canonical labeling or complex pattern growth algorithms.

In one embodiment, the pattern may include a consecutive growth pattern. For example, consecutive growth guides the search in pattern space by conducting a depth-first search, starting with an empty pattern, growing the empty pattern into a one-edge pattern, and exploring all possible patterns in its branch. When one branch is completely searched, additional branches initiated by other one-edge patterns may be searched. Advantageously, the present principles enable efficient pattern growth without repetition as well as providing all possible connected temporal graph patterns. In addition, consecutive growth patterns guarantee that a connected temporal graph pattern will form another connected temporal graph pattern without repetition. In an embodiment, a pattern is a consecutive growth pattern when, given a connected temporal graph pattern g of edge set E and an edge e′=(u′,v′,t′), adding edge e′ into g results in another connected temporal graph pattern and t′=|E|+1. An illustrative example of a consecutive growth pattern is shown in FIG. 3, which will be described in further detail below. In a further embodiment, the consecutive growth pattern may include at least one of a forward growth pattern, a backward growth pattern, or an inward growth pattern, which will be described in further detail below.
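A single consecutive growth step, as defined above, can be sketched in Python as follows. The function name `grow` is hypothetical, and the sketch assumes the input pattern is connected and already has timestamps aligned to 1..|E|; it does not distinguish among forward, backward, and inward growth.

```python
def grow(pattern_edges, u, v):
    """Append edge (u, v) with timestamp |E| + 1 to a connected pattern.

    pattern_edges is a list of (u, v, t) triples with t in 1..|E|.
    Raises ValueError if the new edge would disconnect the pattern.
    """
    nodes = {n for a, b, _ in pattern_edges for n in (a, b)}
    # A non-empty pattern stays connected only if the new edge touches it.
    if pattern_edges and u not in nodes and v not in nodes:
        raise ValueError("new edge must touch the existing pattern")
    return pattern_edges + [(u, v, len(pattern_edges) + 1)]


one_edge = grow([], "a", "b")         # the empty pattern grows into one edge
two_edges = grow(one_edge, "b", "c")  # the new edge always gets the largest timestamp
```

Because the new edge always receives the largest timestamp, a given pattern can be reached by only one sequence of such steps, which is the intuition behind growth without repetition.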

With continued reference to FIG. 1, after the pattern between the temporal graph patterns is determined, the method includes pruning the pattern to provide at least one discriminative temporal graph, as shown in block 106. In one embodiment, the patterns are pruned to select only those sub-relations with maximum frequency and/or maximum discriminative score. For any temporal graph pattern g, its discriminative score may be evaluated by a discriminative function F, which returns a real value for g as its discriminative score. Among all possible patterns, the patterns with the largest discriminative score have the maximum discriminative score. In a further embodiment, pruning includes pruning temporal sub-relations, including subgraph pruning and/or supergraph pruning, which will be described in further detail below.

In some embodiments, given a set of temporal graphs 𝔾 and a temporal graph pattern g, the frequency of the temporal graph pattern g with respect to 𝔾 may be defined as the fraction of graphs in 𝔾 that contain g as a temporal subgraph:

$\mathrm{freq}(\mathbb{G}, g) = \frac{\left|\left\{ G \mid g \subseteq_{t} G \wedge G \in \mathbb{G} \right\}\right|}{\left|\mathbb{G}\right|}.$

According to the present principles, a set of positive temporal graphs, G_(p), and a set of negative temporal graphs, G_(n), may be generated to find the connected temporal graph patterns g* with maximum discriminative score F(freq(G_(p),g*),freq(G_(n),g*)), where F(x,y) is a discriminative score function with partial anti-monotonicity, such that (1) when x is fixed, the smaller y is, the larger F(x,y) is, and (2) when y is fixed, the larger x is, the larger F(x,y) is. F(x,y) is a discriminative function with two variables x and y, where x is freq(G_(p),g) (e.g., the frequency of temporal graph pattern g in the positive graph set G_(p)) and y is freq(G_(n),g) (e.g., the frequency of pattern g in the negative graph set G_(n)). It should be noted that F(x,y) may include score functions, such as, for example, G-test, information gain, etc. In a preferred embodiment, a discriminative score function that satisfies partial anti-monotonicity and best fits the query formulation task may be selected. It should also be noted that the discriminative score of a temporal graph pattern g is denoted as F(g).

In one embodiment, the set of positive temporal graphs G_(p) and the set of negative temporal graphs G_(n) may be employed to determine the most discriminative temporal graph patterns in the system data logs. In a further embodiment, once the discriminative temporal graph patterns are determined, the discriminative temporal graph patterns may be ranked by domain knowledge, including semantic/security implication on node labels and node label popularity among monitoring data, to identify the patterns that best serve the purpose of behavior search.

A search algorithm may include a pruning condition, such as consideration of an upper bound of a pattern's discriminative score. Given a temporal graph pattern g, the upper bound of g indicates the largest possible discriminative score that could be achieved by g's supergraphs. Letting G_(p) and G_(n) be a positive graph set and a negative graph set, respectively, the upper bound may be F(freq(G_(p),g′),freq(G_(n),g′))≤F(freq(G_(p),g),0), since, for every supergraph g′ with g ⊂_(t) g′, freq(G_(p),g′)≤freq(G_(p),g) and freq(G_(n),g′)≥0. While the upper bound is theoretically tight, it may be ineffective for pruning in practice.
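The upper-bound pruning condition can be sketched as follows in Python. The score F(x,y)=x−y used here is a hypothetical stand-in chosen only because it trivially satisfies partial anti-monotonicity; it is not the G-test or information gain mentioned above. The function names and the strict comparison against the best score found so far are likewise assumptions.

```python
def score(x, y):
    """Hypothetical stand-in for F(x, y): grows with x, shrinks with y."""
    return x - y


def upper_bound(pos_freq, F=score):
    # No supergraph can be more frequent in the positive set than g itself,
    # and its negative frequency is at least 0, so F(freq_p(g), 0) bounds
    # the score of every pattern in g's branch.
    return F(pos_freq, 0.0)


def can_prune_branch(pos_freq, best_so_far, F=score):
    """True if nothing grown from this pattern can reach best_so_far."""
    return upper_bound(pos_freq, F) < best_so_far


# A pattern seen in 40% of positive graphs cannot beat a best score of 0.6.
skip_branch = can_prune_branch(0.4, 0.6)
keep_branch = can_prune_branch(0.7, 0.6)
```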

In an embodiment, pruning the pattern between the temporal graph patterns may include determining a set of residual graphs for each temporal graph pattern. For example, if G′ is a subgraph of G, the edges in G whose timestamps are no greater than the largest edge timestamp in G′ may be removed to form a residual graph. Given a temporal graph G=(V,E,A,T) and its subgraph G′=(V′,E′,A′,T′), R(G,G′)=(V_(R),E_(R),A_(R),T_(R)) is G's residual graph with respect to G′, where (1) E_(R)⊂E satisfies ∀(u₁,v₁,t₁)∈E_(R), (u₂,v₂,t₂)∈E′, t₁>t₂, and (2) V_(R) is the set of nodes that are associated with edges in E_(R). The size of the residual graph R(G,G′) may be defined as |R(G,G′)|=|E_(R)| (e.g., the number of edges in R(G,G′)). Accordingly, the residual node label set of a residual graph R(G,G′) may be defined as L_(R)(G,G′)={A_(R)(u)|∀u∈V_(R)}. An illustrative example of a temporal graph pattern g, a temporal graph G, a temporal subgraph G′, a residual graph R(G,G′), and a residual node label set L_(R)(G,G′)={A_(R)(u)|∀u∈V_(R)} is shown in FIG. 5, which will be described in further detail below.
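To make the residual graph definition concrete, the following Python sketch keeps only the edges of G that occur after every edge of the matched subgraph G′ and collects the labels of the nodes they touch. The function name `residual_graph` and the example labels are hypothetical, and the subgraph's edges are assumed to carry G's own timestamps.

```python
def residual_graph(labels, edges, sub_edges):
    """Return (E_R, V_R, L_R) for graph (labels, edges) and subgraph edges.

    labels maps node -> label; edges and sub_edges are lists of (u, v, t),
    with sub_edges non-empty and using the timestamps of the full graph.
    """
    cutoff = max(t for _, _, t in sub_edges)
    # E_R: edges strictly later than every edge of the subgraph.
    res_edges = [(u, v, t) for u, v, t in edges if t > cutoff]
    # V_R: nodes associated with edges in E_R.
    res_nodes = {n for u, v, _ in res_edges for n in (u, v)}
    # L_R: labels of the nodes in V_R.
    res_labels = {labels[n] for n in res_nodes}
    return res_edges, res_nodes, res_labels


node_labels = {1: "A", 2: "B", 3: "C", 4: "D"}
all_edges = [(1, 2, 3), (2, 3, 5), (3, 4, 8), (1, 4, 9)]
matched = [(1, 2, 3), (2, 3, 5)]
E_R, V_R, L_R = residual_graph(node_labels, all_edges, matched)
# len(E_R) is the size |R(G, G')| of the residual graph
```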

Accordingly, M(G,g) may represent a set including all the subgraphs in G that match a temporal graph pattern g. Given G_(p) and g, a positive residual graph set R(G_(p),g) may be defined as:

$R(G_{p}, g) = \bigcup_{G \in G_{p}} \left\{ R(G, G') \mid G' \in M(G, g) \right\}.$

Given R(G_(p),g), its residual node label set L(G_(p),g) may then be defined as:

$L(G_{p}, g) = \bigcup_{G \in G_{p}} \bigcup_{G' \in M(G, g)} L_{R}(G, G').$

Similarly, a negative residual graph set R(G_(n),g) and its residual node label set L(G_(n),g) may be defined. Accordingly, given a temporal graph set G and two temporal graph patterns g₁ ⊂_(t) g₂, if R(G,g₁)=R(G,g₂), then the node mapping between g₁ and g₂ is unique.

In one embodiment, pruning the temporal graph patterns in block 106 may include subgraph pruning. It should be noted that, for a temporal graph pattern g, g's branch may be employed to refer to the space of patterns that are grown from g, and F* denotes the largest discriminative score discovered. In subgraph pruning, g₁ and g₂ represent temporal graph patterns where g₁ is discovered before g₂. If g₂ is a temporal subgraph of g₁, and g₁ and g₂ share identical positive residual graph sets, and for those nodes in g₁ that cannot match to any nodes in g₂, their labels never appear in g₂'s residual node label set, subgraph pruning on g₂ may be performed. Given a discovered pattern g₁=(V₁,E₁,A₁,T₁) and a pattern g₂ of node set V₂, if (1) g₂ ⊂_(t) g₁, (2) R(G_(p),g₂)=R(G_(p),g₁), and (3) L(G_(p),g₂)∩L_(g₁\g₂)=φ, where φ is the empty set, L_(g₁\g₂)={A₁(u)|∀u∈V₁\V₁′}, and V₁′⊂V₁ is the set of nodes that map to nodes in V₂, then the search on g₂'s branch may be pruned, if the largest discriminative score for patterns in g₁'s branch is smaller than F*. An illustrative example of subgraph pruning is shown in FIG. 6, which will be described in further detail below.
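For illustration, the subgraph pruning test can be written as a conjunction of the three conditions above plus the comparison against F*. In the Python sketch below, the function name, the argument names, and the assumption that the residual graph sets and label sets have already been computed are all hypothetical; residual graph sets may use any representation that supports equality comparison.

```python
def can_subgraph_prune(g2_in_g1, res_pos_g1, res_pos_g2,
                       res_labels_g2, unmatched_labels_g1,
                       best_in_g1_branch, best_overall):
    """Decide whether the search on g2's branch may be skipped.

    g2_in_g1            -- condition (1): g2 is a temporal subgraph of g1
    res_pos_g1/g2       -- positive residual graph sets, condition (2)
    res_labels_g2       -- L(G_p, g2), a set of node labels
    unmatched_labels_g1 -- labels of g1's nodes that map to no node in g2
    best_in_g1_branch   -- largest score found in g1's branch
    best_overall        -- F*, the largest score discovered so far
    """
    return (g2_in_g1
            and res_pos_g1 == res_pos_g2
            and not (res_labels_g2 & unmatched_labels_g1)  # condition (3)
            and best_in_g1_branch < best_overall)


# Hypothetical precomputed inputs.
shared_residuals = frozenset({frozenset({("x", "y", 4)})})
prune_ok = can_subgraph_prune(True, shared_residuals, shared_residuals,
                              {"file"}, {"socket"}, 0.3, 0.5)
```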

Accordingly, subgraph pruning prunes pattern space without missing any of the most discriminative patterns. This may be referred to as Lemma 4. To prove this lemma, g₁ and g₂ are temporal graph patterns, where g₁ is discovered before g₂, and it is assumed that g₁ and g₂ satisfy the conditions in subgraph pruning. Since the conditions in subgraph pruning are satisfied, the following facts may be derived: (1) freq(G_(p),g₂)=freq(G_(p),g₁) and (2) pattern growth in g₁'s branch will never touch the nodes that cannot map to any nodes in g₂, as L(G_(p),g₂)∩L_(g₁\g₂)=φ. Assume there exists a pattern g₂′ whose discriminative score is no less than F* and s is the sequence of consecutive growth that grows g₂ into g₂′. Since no pattern growth in g₁'s branch will touch the nodes that cannot map to any nodes in g₂, s then indicates a valid sequence of consecutive growth (with some timestamp shift) that grows g₁ into g₁′.

By freq(G_(p),g₂)=freq(G_(p),g₁) and R(G_(p),g₂)=R(G_(p),g₁), it may be inferred that freq(G_(p),g₂′)=freq(G_(p),g₁′). Accordingly, g₂′⊂ _(t)g₁′ and freq(G_(n),g₂′)≧freq(G_(n),g₁′), and it may be inferred that F(g₂′)≦F(g₁′), meaning that g₁′ is one of the most discriminative patterns, which contradicts the condition that none of the patterns in g₁'s branch is the most discriminative. Thus, none of the patterns in g₂'s branch will be the most discriminative, if the conditions in subgraph pruning are satisfied and none of the patterns in g₁'s branch is the most discriminative. Therefore, any pattern in g₂'s branch will have a discriminative score less than F*, and the branch can be safely pruned.

In one embodiment, pruning the temporal graph patterns in block 106 may include supergraph pruning. In supergraph pruning, g₁ and g₂ represent temporal graph patterns where g₁ is discovered before g₂. If g₁ is a temporal subgraph of g₂, and g₁ and g₂ share identical positive residual graph sets, and g₁ and g₂ have the same number of nodes, then supergraph pruning on g₂ may be performed. Given two patterns g₁ and g₂, where g₁ is discovered before g₂ and g₂ is not grown from g₁, if (1) g₂ ⊃ _(t)g₁, (2) R(G_(p),g₂)=R(G_(p),g₁), (3) R(G_(n),g₂)=R(G_(n),g₁), and (4) g₂ and g₁ have the same number of nodes, the search in g₂'s branch may be safely pruned, if the largest discriminative score for g₁'s branch is smaller than F*. An illustrative example of supergraph pruning is shown in FIG. 7, which will be described in further detail below.

Accordingly, supergraph pruning prunes pattern space without missing the most discriminative patterns. This may be referred to as Proposition 2. Lemma 4 and Proposition 2 may lead to the following theorem, namely, that performing subgraph pruning and supergraph pruning guarantees the most discriminative patterns will still be preserved.

This theorem identifies general cases in which pruning may be conducted in temporal graph space. In some embodiments, however, it may be advantageous to conduct subgraph pruning and/or supergraph pruning when the overhead for discovering these pruning opportunities is small. The major overhead of subgraph pruning and supergraph pruning may come from two sources: (1) temporal subgraph tests (e.g., g₂ ⊂ _(t)g₁), and (2) residual graph set equivalence tests (e.g., R(G_(p),g₂)=R(G_(p),g₁)). Accordingly, the method 100 may further include minimizing this overhead.

With continued reference to FIG. 1, in block 106, the method 100 may include minimizing overhead from subgraph tests, as shown in block 107, and minimizing overhead from residual graph set equivalence tests, as shown in block 108. In some embodiments, when pruning includes subgraph pruning and/or supergraph pruning, the method may include either one or both of blocks 107 and 108.

In block 107, the method 100 may include minimizing overhead from subgraph tests. In an embodiment, minimizing overhead from subgraph tests may include representing temporal graphs by sequences using an encoding scheme and employing a light-weight algorithm based on subsequence tests. Given two temporal graphs g and g′, it is NP-complete to decide g⊂ _(t)g′. Since edges are totally ordered in temporal graphs, temporal graphs may be encoded into sequences. In addition, after temporal graphs are represented as sequences, a faster temporal subgraph test may be employed using efficient subsequence tests.

A temporal graph pattern g may be represented by two sequences, namely a node sequence and an edge sequence. A node sequence, nodeseq(g), is a sequence of labeled nodes. Given that g is traversed by its edge temporal order, nodes in nodeseq(g) may be ordered by their first visited time. Any node of g may appear only once in nodeseq(g). An edge sequence, edgeseq(g), is a sequence of edges in g, where edges are ordered by their timestamps. Let s₁=(a₁,a₂, . . . , a_(n)) and s₂=(b₁,b₂, . . . , b_(m)) be two sequences, where a_(i) is the i-th element in the sequence s₁, b_(i) is the i-th element in the sequence s₂, n is the total number of elements in the sequence s₁, and m is the total number of elements in the sequence s₂. If there exist 1≦i₁<i₂< . . . <i_(n)≦m such that ∀1≦j≦n, a_(j)=b_(i_j), then s₁ is a subsequence of s₂, denoted as s₁ ⊂s₂. It should be noted that i₁, i₂, . . . , i_(n) are n integer variables in the range between 1 and m and j is an integer variable in the range between 1 and n. For example, if n=5, m=7, then s₁ is a sequence of five elements as s₁=(a₁,a₂,a₃,a₄,a₅) and s₂ is a sequence of seven elements as s₂=(b₁,b₂,b₃,b₄,b₅,b₆,b₇). In this case, i₁, i₂, . . . , i₅ are five integer variables that are no smaller than 1 and no greater than 7. In terms of mapping, j maps to i_(j) (e.g., j=2 maps to i₂ so that a₂ maps to b_(i₂)). An illustrative example of sequence-based temporal graph representation and temporal subgraph test is shown in FIG. 8, which will be described in further detail below.
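The subsequence relation defined above can be tested in a single left-to-right scan. The following is a minimal sketch in Python; the function name `is_subsequence` is a hypothetical helper, not a name used in the present disclosure.

```python
def is_subsequence(s1, s2):
    """Return True if s1 is a subsequence of s2 (order kept, gaps allowed)."""
    it = iter(s2)
    # `x in it` consumes the iterator up to and including the first match,
    # so the matched positions i_1 < i_2 < ... < i_n are strictly increasing.
    return all(x in it for x in s1)
```

For instance, `is_subsequence("ACE", "ABCDE")` holds, while `is_subsequence("AEC", "ABCDE")` does not, because the elements of the first sequence must be matched in order.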

In an embodiment, minimizing overhead from subgraph tests includes providing an enhanced node sequence of a temporal graph, enhseq(g). This is because, given two temporal graphs g₁ and g₂ with g₁ ⊂ _(t)g₂, nodeseq(g₁)⊂nodeseq(g₂) may not hold. Accordingly, if g is a temporal graph, enhseq(g) is a sequence of labeled nodes in g. Given that temporal graph pattern g is traversed by its edge temporal order, enhseq(g) may be constructed by processing each edge (u,v,t) as follows. (1) If u is the last added node in the current enhseq(g), or u is the source node of the last processed edge, u may be skipped; otherwise, u will be added into enhseq(g). (2) Node v may always be added into enhseq(g). It should be noted that nodes in g might appear multiple times in enhseq(g).
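As a minimal sketch of the two node sequences, the following Python functions assume that edges are given as (u, v, t) tuples of node IDs and a timestamp, and that the source node of an edge is visited before its destination (an assumption; the description does not fix this order). The sequences are returned as node IDs; labels would be looked up separately.

```python
def nodeseq(edges):
    """Node IDs ordered by first visit when edges are traversed in temporal order."""
    seen, seq = set(), []
    for u, v, _t in sorted(edges, key=lambda e: e[2]):
        for node in (u, v):  # assumption: source is visited before destination
            if node not in seen:
                seen.add(node)
                seq.append(node)
    return seq


def enhseq(edges):
    """Enhanced node sequence; a node may appear more than once."""
    seq = []
    prev_source = None
    for u, v, _t in sorted(edges, key=lambda e: e[2]):
        just_added = bool(seq) and seq[-1] == u
        # Rule (1): skip u if it was the last added node or the previous source.
        if not just_added and u != prev_source:
            seq.append(u)
        # Rule (2): v is always added.
        seq.append(v)
        prev_source = u
    return seq
```

With the illustrative edge list `[(1, 2, 1), (2, 3, 2), (1, 3, 3), (4, 3, 4)]`, `nodeseq` returns `[1, 2, 3, 4]` and `enhseq` returns `[1, 2, 3, 1, 3, 4, 3]`.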

Accordingly, two temporal graphs g₁ ⊂ _(t)g₂ if and only if:

nodeseq(g₁)⊂enhseq(g₂), where the underlying match forms an injective node mapping f_(s) from nodes in g₁ to nodes in g₂; and

f_(s)(edgeseq(g₁))⊂edgeseq(g₂), where f_(s)(edgeseq(g₁)) is an edge sequence in which the nodes in g₁ are replaced by the nodes in g₂ via the node mapping f_(s). This may be referred to as Lemma 5.

In block 108, the method 100 may include minimizing overhead from residual graph set equivalence tests. In an embodiment, g₁ and g₂ represent temporal graph patterns. Accordingly, G₁′ and G₂′ may be the matches of temporal graph patterns g₁ and g₂ in temporal graph G, respectively. Since edges in temporal graphs have total order, the following result may be derived: the residual graph R(G,G₁′) is equivalent to the residual graph R(G,G₂′) if and only if the residual graphs for G₁′ and G₂′ have the same size, e.g., |R(G,G₁′)|=|R(G,G₂′)|. Thus, given temporal graph patterns g₁ and g₂ with g₁ ⊂ _(t)g₂, and a set of graphs G, residual graph sets R(G,g₁)=R(G,g₂) if and only if I(G,g₁)=I(G,g₂), where

${I\left( {G,g_{i}} \right)}{\sum\limits_{{R{({G,G^{\prime}})}} \in {r{({G,g_{i}})}}}^{\;}\; {{{R\left( {G,G^{\prime}} \right)}}.}}$

This may be referred to as Lemma 6. R(G,G′) is a residual graph, and |R(G,G′)| is the size of R(G,G′), which is an integer. Therefore, I(G,g_(i)) is a function with two variables G and g_(i), which returns an integer obtained by summing up the sizes of all residual graphs in the graph set R(G,g_(i)). Accordingly, overhead may be minimized by testing equivalent residual graph sets by leveraging temporal information in graphs.
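The integer I(G,g_(i)) can be sketched as follows. The sketch assumes, as one reading of the residual graph definition, that R(G,G′) consists of the edges of G whose timestamps are later than every edge of the match G′, and that its size is its number of edges. Under that assumption a residual graph is identified by the graph it comes from and the last timestamp of the match. The function names are hypothetical.

```python
def residual_size(graph_edges, match_edges):
    """Number of edges of the graph that occur after the last edge of the match."""
    last_t = max(t for _u, _v, t in match_edges)
    return sum(1 for _u, _v, t in graph_edges if t > last_t)


def residual_size_sum(graphs, matches_per_graph):
    """Sum of sizes over the distinct residual graphs produced by all matches."""
    distinct = {}
    for idx, (graph_edges, matches) in enumerate(zip(graphs, matches_per_graph)):
        for match_edges in matches:
            last_t = max(t for _u, _v, t in match_edges)
            # Matches ending at the same timestamp share one residual graph.
            distinct[(idx, last_t)] = residual_size(graph_edges, match_edges)
    return sum(distinct.values())
```

Comparing two such integers replaces a graph-by-graph comparison of the residual graph sets, which is the saving that Lemma 6 describes.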

Advantageously, pruning redundant searches of temporal graph patterns that share similar and/or identical growth trends minimizes overhead of temporal subgraph tests and residual graph set equivalence tests that are used for identifying pruning opportunities. In addition, pruning redundant searches of temporal graph patterns reduces computation time and minimizes overhead during the mining process, since the underlying pattern space could be large and a typical naive search algorithm cannot scale.

In block 110, behavior queries based on the discriminative temporal graphs may be generated. In an embodiment, patterns with the highest discriminative score may be selected as queries to search target behavior activities from a repository of system data logs to determine if there are abnormal and/or suspicious activities occurring (e.g., a target behavior occurring too many times over a Saturday night). For example, the discriminative temporal graph may be used to construct behavior queries, and may subsequently be employed to query a computer system, such as system data logs, to determine if target behaviors have been performed. For example, the discriminative temporal graph may be used to form a graph query (e.g., a behavior query) to search for the existence of a target behavior in collected system monitoring data. To search for the existence of a target behavior in the system, the graph query may be used to perform a pattern search over the large temporal graph of the system data to find subgraphs of the large temporal graph that match the query. Each match may indicate one possible existence of the target behavior in the system. In an embodiment, the present principles may be applied to behavior queries with multiple behaviors. For example, for each target behavior, its discriminative pattern is determined to generate a respective behavior query, and the respective behavior queries are employed to search the system monitoring data for the existence (e.g., a match) of each target behavior. In another embodiment, the matches may be connected to form a behavior query associated with the multiple behaviors. Advantageously, the present principles increase computation efficiency and reduce storage of such information, since repeated searches and/or patterns are pruned.

The method 100 provides an effective method for behavior analysis, with behavior queries having high precision (e.g., 97%) and high recall (e.g., 91%), an improvement in precision over non-temporal graph patterns, whose precision and recall are 83% and 91%, respectively. Precision and recall are generally used as the metrics to evaluate the accuracy of the present principles. Given a target behavior and its behavior query, a match of this behavior query is called an identified instance. An identified instance is correct if the time interval during which the match happened is fully contained in a time interval during which one of the true behavior instances was under execution. A behavior instance is discovered if the behavior query can return at least one correct identified instance with respect to this behavior instance. Accordingly, precision is defined as the number of correctly identified instances divided by the total number of identified instances, and recall is defined as the number of discovered instances divided by the number of behavior instances. In addition to these advantages, the present principles provided herein are more efficient and enable faster pattern mining in temporal graphs than previous methods, typically providing pattern mining approximately thirty-two times faster than previously employed methods.
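The precision and recall definitions above can be made concrete with a short sketch. It assumes that identified instances and true behavior instances are each given as closed (start, end) time intervals, and it returns 0.0 when a denominator is zero; both choices are assumptions made for illustration.

```python
def contains(outer, inner):
    """True if interval `inner` is fully contained in interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def precision_recall(identified, truth):
    """Precision and recall of identified instances against true behavior instances."""
    # An identified instance is correct if some true instance contains it.
    correct = sum(1 for m in identified if any(contains(t, m) for t in truth))
    # A true instance is discovered if it contains at least one identified instance.
    discovered = sum(1 for t in truth if any(contains(t, m) for m in identified))
    precision = correct / len(identified) if identified else 0.0
    recall = discovered / len(truth) if truth else 0.0
    return precision, recall
```

For example, with true instances `[(0, 10), (20, 30), (40, 50)]` and identified instances `[(1, 3), (4, 6), (22, 25), (35, 38)]`, three of the four identified instances are correct (precision 0.75) and two of the three behavior instances are discovered (recall 2/3).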

It should be noted that discriminative graph pattern mining dealing with non-temporal graphs requires identical activities happening within the exact same time intervals. In addition, it is difficult to extend existing works that mine discriminative static graph patterns to handle temporal graphs, since their canonical labeling techniques cannot deal with temporal graphs, which could have multiple edges between the same pair of nodes and include temporal edge orders. Moreover, discriminative graph pattern mining dealing with non-temporal graphs does not discuss how to deal with timestamps in the mining process. If timestamps are ignored, multi-edges must be collapsed into a single edge, and the final result of the discriminative mining will be a partial result, as it excludes patterns with multi-edges. In addition, redundancy in non-temporal patterns may bring potential scalability problems, as a large number of temporal patterns may share the same non-temporal patterns, and a discriminative non-temporal pattern may result in no discriminative temporal pattern.

Now referring to FIG. 2, several temporal graphs are shown for illustrative purposes. In an embodiment, it is preferable to use temporal graphs with total edge order. As shown in FIG. 2, temporal graph G₁ illustrates multi-edges as contemplated in the present invention. According to the present principles, temporal graphs that include node labels (e.g., A, B, C, D, E, etc.) and/or edge timestamps (e.g., 1, 2, 3, 4, 5, 6, 7, etc.) are contemplated in addition to temporal graphs with edge labels. In one embodiment, the timestamps in the temporal graph patterns may be aligned (e.g., from 1 to |E|) and, in some embodiments, only total edge order is kept, unlike general temporal graphs where timestamps could be arbitrary non-negative integers.

In FIG. 2, an example of a temporal subgraph is illustratively depicted, where G₂ is a temporal subgraph of G₁, namely G₂ ⊂ _(t)G₁. In particular, the temporal subgraph in G₁, which may be formed by the edges with timestamps 4, 5, and 6, is a match of G₂. With continued reference to FIG. 2, temporal graphs G₁ and G₂ are T-connected temporal graphs while temporal graph G₃ is not T-connected (e.g., non T-connected), since the graph formed by edges with timestamps smaller than five is disconnected. In a preferred embodiment, discriminative mining is employed with T-connected temporal graph patterns (hereinafter referred to as “connected temporal graphs”). In pattern growth, T-connected patterns remain connected, while non T-connected patterns might be disconnected during the growth process, resulting in formidable growth of pattern search space. In addition, any non T-connected temporal graph may be formed by a set of T-connected temporal graphs. In an embodiment, a single T-connected pattern or a set of T-connected patterns that form a non T-connected pattern may be used to form a behavior query.
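The T-connected property described above can be checked with a single pass over the edges in temporal order. The sketch below assumes that connectivity ignores edge direction and that every prefix of the temporally ordered edge sequence must form a connected graph; both are readings of the description rather than definitions taken from it. `is_t_connected` is a hypothetical helper name.

```python
def is_t_connected(edges):
    """True if every temporal prefix of the (u, v, t) edge list is connected."""
    ordered = sorted(edges, key=lambda e: e[2])
    if not ordered:
        return True
    u0, v0, _t0 = ordered[0]
    seen = {u0, v0}
    for u, v, _t in ordered[1:]:
        # A new edge keeps the prefix connected only if it touches a seen node.
        if u not in seen and v not in seen:
            return False
        seen.update((u, v))
    return True
```

For example, the edge list `[("A", "B", 1), ("C", "D", 2), ("B", "C", 3)]` is connected as a whole but is not T-connected, because the graph formed by its first two edges is disconnected.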

Now referring to FIG. 3, an example of a consecutive growth pattern 300 for temporal graph patterns is illustrated for exemplary purposes. In FIG. 3, a consecutive growth pattern 300 may be determined when a temporal graph pattern g₁ is grown to temporal graph pattern g₄ by consecutive growth. In an embodiment, consecutive growth occurs when, given a connected temporal graph pattern g of edge set E and an edge e′=(u′,v′,t′), edge e′ is added into g with t′=|E|+1 and another connected temporal graph pattern results.

For example, assuming g₁ and g₂ are connected temporal graph patterns with g₁ ⊂ _(t)g₂, either there exists a unique way to grow g₁ into g₂ by consecutive growth, or there is no way to grow g₁ into g₂. This may be referred to herein as Lemma 3. If the edge sets of g₁ and g₂ are E₁ and E₂, respectively, m=|E₂|−|E₁| steps of consecutive growth may be conducted to grow g₁ into another pattern g₂′. If there exists g₂′=_(t)g₂, then it may be possible to grow g₁ into g₂. Otherwise, there is no way to grow g₁ into g₂. If g₁ may be grown into g₂, then the m steps of consecutive growth are unique.

For example, assume that (1) s′=⟨e₁′,e₂′, . . . , e_(m)′⟩ is a sequence of consecutive growth that grows g₁ into g₂′ with g₂′=_(t)g₂, (2) s″=⟨e₁″,e₂″, . . . , e_(m)″⟩ is another sequence of consecutive growth that grows g₁ into g₂″ with g₂″=_(t)g₂, and (3) s′ is distinct from s″, in that there exists an edge (u′,v′,t′)∈s′ that cannot match the edge (u″,v″,t″)∈s″ sharing the same timestamp. Since g₂′=_(t)g₂ and g₂″=_(t)g₂, g₂′=_(t)g₂″ may be inferred by the bijective mapping functions. By the definition of a consecutive growth pattern, the linear scan from Lemma 2 may decide that g₂′ cannot match g₂″, since there exists at least one edge from s′ that cannot match the edge in s″ sharing the same timestamp, which contradicts g₂′=_(t)g₂″. Thus, s′ is identical to s″, and the m steps of consecutive growth are unique.
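The argument above relies on a linear scan that decides whether two temporal graph patterns are equal. One possible form of such a scan is sketched below. It assumes that both patterns have aligned timestamps, that edges are (u, v, t) tuples, and that node labels are supplied as dictionaries; it is an illustrative reading, not the statement of Lemma 2, and `temporal_equal` is a hypothetical name.

```python
def temporal_equal(edges1, labels1, edges2, labels2):
    """Linear scan: edges with the same rank must match under one bijection."""
    if len(edges1) != len(edges2):
        return False
    fwd, bwd = {}, {}  # node mapping g1 -> g2 and its inverse
    ordered1 = sorted(edges1, key=lambda e: e[2])
    ordered2 = sorted(edges2, key=lambda e: e[2])
    for (u1, v1, _t1), (u2, v2, _t2) in zip(ordered1, ordered2):
        for a, b in ((u1, u2), (v1, v2)):
            if labels1[a] != labels2[b]:
                return False
            # setdefault records a new pair or returns the existing partner.
            if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
                return False
    return True
```

Because edges are totally ordered, the i-th edge of one pattern can only correspond to the i-th edge of the other, so no search over alternative mappings is needed.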

Now referring to FIGS. 4A-4C, the consecutive growth pattern may includeat least one of a forward growth pattern, a backward growth pattern, oran inward growth pattern, which will be described in further detailbelow. FIG. 4A is an illustrative example of a forward growth pattern.FIG. 4B is an illustrative example of a backward growth pattern. FIG. 4Cis an illustrative example of an inward growth pattern. Advantageously,the forward growth pattern, backward growth pattern and/or inward growthpattern enable the non-repetitive graph pattern to cover the wholepattern space to achieve completeness and guarantee the quality ofdiscovered patterns.

For example, letting g be a connected temporal graph pattern with nodeset V, temporal graph pattern g may be grown by consecutive growth asfollows. If the non-repetitive graph pattern includes a forward growthpattern 400A, as shown in FIG. 4A, then temporal graph pattern g may begrown by an edge (u,v,t) if u∈V and v∉V. If the non-repetitive graphpattern includes a backward growth pattern 400B, as shown in FIG. 4B,then temporal graph pattern g may be grown by an edge (u,v,t) if u∉V andv∈V. If the non-repetitive graph pattern includes an inward growthpattern 400C, as shown in FIG. 4C, then temporal graph pattern g may begrown by an edge (u,v,t) if u∈V and v∈V. It should be noted that theinward growth pattern 400C allows multi-edges between node pairs.Accordingly, the three growth patterns, namely forward 400A, backward400B, and inward 400C, provide guidance to conduct a complete searchover the pattern space.
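The three growth patterns can be illustrated with a short sketch that classifies a candidate edge against the current node set V and appends it with timestamp |E|+1, as in consecutive growth. Edges are assumed to be (u, v, t) tuples; the function names are hypothetical.

```python
def growth_type(nodes, u, v):
    """Classify edge (u, v) relative to the pattern's node set."""
    if u in nodes and v in nodes:
        return "inward"
    if u in nodes:
        return "forward"
    if v in nodes:
        return "backward"
    return None  # neither endpoint is in the pattern: it would be disconnected


def grow(pattern_edges, u, v):
    """Consecutive growth: append (u, v) with timestamp |E| + 1."""
    nodes = {n for a, b, _t in pattern_edges for n in (a, b)}
    kind = growth_type(nodes, u, v)
    if kind is None:
        raise ValueError("edge does not touch the pattern")
    return pattern_edges + [(u, v, len(pattern_edges) + 1)], kind
```

Starting from the single-edge pattern `[(1, 2, 1)]`, adding (2, 3) is forward growth, adding (4, 1) is backward growth, and adding (2, 1) or a second (1, 2) is inward growth, the latter producing a multi-edge.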

For example, if A represents a search algorithm following consecutivegrowth with forward, backward, and inward growth patterns, algorithm Aguarantees (1) a complete search over pattern space, and (2) no patternwill be searched more than once. This may be referred to herein asTheorem 1. Assuming temporal graph pattern g is a connected temporalgraph pattern, Lemma 3 states that a consecutive growth patternguarantees a unique way to grow an empty pattern into g to ensure thatno pattern may be searched more than once. Thus, there is no way tosearch g more than once. For completeness over the pattern search,assume m is the number of edges in a temporal graph pattern. If thecompleteness holds for m=k, then it holds for m=k+1. Assuming thecompleteness holds for m=k, the complete set of k-edge connectedtemporal graph patterns H^((k)) is determined. Further, ifg^((k+1))=g^((k))∪{e} is a connected pattern of k+1 edges that is grownfrom a pattern g^((k)) of k edges, and since the three growth patternsare all possible ways to keep patterns connected during growth, ifg^((k+1)) cannot be covered by growing patterns in H^((k)), it impliesg^((k))∉H^((k)), that is, g^((k)) is not connected, which contradictswith the assumption that g^((k+1)) is connected (e.g., T-connected).Therefore, the completeness also holds for m=k+1.

Now referring to FIG. 5, a temporal graph pattern g, a temporal graph G, a temporal subgraph G′, a residual graph R(G,G′), and a residual node label set L_(R)(G,G′)={A_(R)(u)|∀u∈V_(R)} are illustratively shown, in accordance with the present principles. As shown in FIG. 5, temporal graph G′ is a subgraph of temporal graph G, R(G,G′) represents G's residual graph with respect to G′, and L_(R)(G,G′) is the residual graph's node label set.

Now referring to FIG. 6, an example of subgraph pruning 600 is illustratively depicted, in accordance with the present principles. In the mining process, a pattern g₂ may be determined and a discovered pattern g₁ may exist, which satisfies the conditions in subgraph pruning. Therefore, pattern growth in g₁'s branch suggests how to grow g₂ to larger patterns (e.g., growing g₁ to g₁′ indicates that g₂ can be grown to g₂′). Since none of the patterns in g₁'s branch have the score F*, the patterns in g₂'s branch cannot be the most discriminative ones either, and g₂'s branch can be safely pruned (e.g., removed).

Now referring to FIG. 7, an illustrative example of a supergraph pruning700 is illustratively depicted, in accordance with the presentprinciples. In the mining process, a temporal graph pattern g₂ may bedetermined, and another pattern g₁ may be discovered before g₂, whichsatisfies the conditions in supergraph pruning. Therefore, the growthknowledge in g₁'s branch suggests how to grow g₂ to larger patterns.Since none of the patterns in g₁'s branch are the most discriminative,it may be inferred that the patterns in g₂'s branch are unpromising aswell, and the search in g₂'s branch may be safely pruned (e.g.,removed).

Now referring to FIG. 8, an illustrative example of a sequence-basedrepresentation 800 is illustratively depicted, in accordance with thepresent principles. In g₁ and g₂, node labels are represented byletters, and nodes of the same labels are differentiated by their nodeIDs represented by integers in brackets. Node labels in nodeseq areassociated with node IDs as subscripts. It should be noted that whennode labels are compared, their subscripts will be ignored (e.g., ∀i, j,B_(i)=B_(j)). Each edge in edgeseq is represented by the followingformat (id(u),id(v)), where id(u) is the source node ID and id(v) is thedestination node ID.

Given two temporal graphs g₁ and g₂, if g₁ ⊂ _(t)g₂, it is expected that nodeseq(g₁)⊂nodeseq(g₂) and edgeseq(g₁)⊂edgeseq(g₂). However, when g₁ ⊂ _(t)g₂, nodeseq(g₁)⊂nodeseq(g₂) may not be true, as shown in FIG. 8, because the first visited time of the node with label E is inconsistent in g₁ and g₂. In an embodiment, as described above, enhanced node sequences of g₁ and g₂ may be provided. As shown in FIG. 8, g₁ and g₂ are two temporal graphs satisfying g₁ ⊂ _(t)g₂. The node sequence of g₁ is a subsequence of the enhanced node sequence of g₂ with the injective node mapping f_(s)(1)=1, f_(s)(2)=5, f_(s)(3)=6, and f_(s)(4)=4 to obtain f_(s)(edgeseq(g₁))=⟨(1,5), (5,6), (4,6)⟩ such that f_(s)(edgeseq(g₁))⊂edgeseq(g₂).
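The edge-sequence condition of the sequence-based test can be sketched using the node mapping given above. The edge sequence of g₁ below is the one implied by the mapping and the mapped result stated in the text; the edge sequence used for g₂ is hypothetical, since FIG. 8 is not reproduced here, and the function names are illustrative only.

```python
def is_subsequence(s1, s2):
    """True if s1 is a subsequence of s2."""
    it = iter(s2)
    return all(x in it for x in s1)


def apply_mapping(edge_seq, f_s):
    """Replace node IDs of g1 by node IDs of g2 via the node mapping f_s."""
    return [(f_s[u], f_s[v]) for u, v in edge_seq]


def edges_embed(edgeseq_g1, edgeseq_g2, f_s):
    """Check that f_s is injective and f_s(edgeseq(g1)) is a subsequence of edgeseq(g2)."""
    if len(set(f_s.values())) != len(f_s):
        return False
    return is_subsequence(apply_mapping(edgeseq_g1, f_s), edgeseq_g2)


f_s = {1: 1, 2: 5, 3: 6, 4: 4}            # mapping stated in the text
edgeseq_g1 = [(1, 2), (2, 3), (4, 3)]     # implied by the mapping and its result
edgeseq_g2 = [(1, 2), (1, 5), (3, 4), (5, 6), (4, 6)]  # hypothetical
```

Here `apply_mapping(edgeseq_g1, f_s)` gives `[(1, 5), (5, 6), (4, 6)]`, and `edges_embed(edgeseq_g1, edgeseq_g2, f_s)` holds for the hypothetical g₂. The node-sequence condition, which yields f_(s) in the first place, is assumed to have been checked separately.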

It should be understood that embodiments described herein may be entirely hardware, or may include both hardware and software elements, which include, but are not limited to, firmware, resident software, microcode, etc.

Embodiments may include a computer program product accessible from acomputer-usable or computer-readable medium providing program code foruse by or in connection with a computer or any instruction executionsystem. A computer-usable or computer readable medium may include anyapparatus that stores, communicates, propagates, or transports theprogram for use by or in connection with the instruction executionsystem, apparatus, or device. The medium can be magnetic, optical,electronic, electromagnetic, infrared, or semiconductor system (orapparatus or device) or a propagation medium. The medium may include acomputer-readable storage medium such as a semiconductor or solid statememory, magnetic tape, a removable computer diskette, a random accessmemory (RAM), a read-only memory (ROM), a rigid magnetic disk and anoptical disk, etc.

A data processing system suitable for storing and/or executing programcode may include at least one processor, e.g., a hardware processor,coupled directly or indirectly to memory elements through a system bus.The memory elements can include local memory employed during actualexecution of the program code, bulk storage, and cache memories whichprovide temporary storage of at least some program code to reduce thenumber of times code is retrieved from bulk storage during execution.Input/output or I/O devices (including but not limited to keyboards,displays, pointing devices, etc.) may be coupled to the system eitherdirectly or through intervening I/O controllers.

Now referring to FIG. 9, an exemplary processing system 900 to which thepresent principles may be applied is illustratively depicted inaccordance with one embodiment of the present principles. The processingsystem 900 includes at least one processor (“CPU”) 904 operativelycoupled to other components via a system bus 902. A cache 906, a ReadOnly Memory (“ROM”) 908, a Random Access Memory (“RAM”) 910, aninput/output (“I/O”) adapter 920, a sound adapter 930, a network adapter940, a user interface adapter 950, and a display adapter 960, areoperatively coupled to the system bus 902.

A storage device 922 and a second storage device 924 are operativelycoupled to system bus 902 by the I/O adapter 920. The storage devices922 and 924 can be any of a disk storage device (e.g., a magnetic oroptical disk storage device), a solid state magnetic device, and soforth. The storage devices 922 and 924 can be the same type of storagedevice or different types of storage devices.

A speaker 932 is operatively coupled to system bus 902 by the soundadapter 930. A transceiver 942 is operatively coupled to system bus 902by network adapter 940. A display device 962 is operatively coupled tosystem bus 902 by display adapter 960.

A first user input device 952, a second user input device 954, and athird user input device 956 are operatively coupled to system bus 902 byuser interface adapter 950. The user input devices 952, 954, and 956 canbe any of a keyboard, a mouse, a keypad, an image capture device, amotion sensing device, a microphone, a device incorporating thefunctionality of at least two of the preceding devices, and so forth. Ofcourse, other types of input devices can also be used. The user inputdevices 952, 954, and 956 can be the same type of user input device ordifferent types of user input devices. The user input devices 952, 954,and 956 are used to input and output information to and from system 900.

Of course, the processing system 900 may also include other elements(not shown), as readily contemplated by one of skill in the art, as wellas omit certain elements. For example, various other input devicesand/or output devices can be included in processing system 900,depending upon the particular implementation of the same, as readilyunderstood by one of ordinary skill in the art. For example, varioustypes of wireless and/or wired input and/or output devices can be used.Moreover, additional processors, controllers, memories, and so forth, invarious configurations can also be utilized as readily appreciated byone of ordinary skill in the art. These and other variations of theprocessing system 900 are readily contemplated by one of ordinary skillin the art given the teachings of the present principles providedherein.

Moreover, it is to be appreciated that system 1000 described below, withrespect to FIG. 10, is a system for implementing respective embodimentsof the present principles. Part or all of processing system 900 may beimplemented in one or more of the elements of system 1000.

Further, it is to be appreciated that processing system 900 may performat least part of the method described herein including, for example, atleast part of method 100 of FIG. 1. Similarly, part or all of system1000 may be used to perform at least part of method 100 of FIG. 1.

FIG. 10 shows an exemplary system 1000 for constructing behavior queries in temporal graphs using discriminative sub-trace mining, in accordance with one embodiment of the present principles. While many aspects of system 1000 are described in singular form for the sake of illustration and clarity, the same can be applied to multiple ones of the items mentioned with respect to the description of system 1000. For example, while a pattern pruner 1012 is described, more than one pattern pruner 1012 may be used in accordance with the teachings of the present principles.

The system 1000 may include a monitoring device 1002, a system data logdatabase 1004, a temporal graph generator 1006, a temporal graph patterngenerator 1008, a pattern determiner 1010, a pattern pruner 1012, abehavior query generator 1014, and a storage device 1016.

The monitoring device 1002 may be configured to monitor system data of a computer system. For example, the monitoring device 1002 may monitor execution of behavior traces at the computer system. In addition, the monitoring device 1002 may be configured to generate system data logs, which may be stored in the system data log database 1004 and may be accessed by various components of the system 1000. As described above, system data logs may include raw system behaviors, target behaviors and/or background behaviors, and may be monitored and collected by monitoring device 1002 and may be employed as input data. In addition, the system data logs may include information relating to how system entities interact with each other at the operating system level and may include timestamps. In a further embodiment, monitoring device 1002 may be configured to monitor system data in a closed environment, where target behaviors and/or background behaviors are performed independently of each other.

The temporal graph generator 1006 may be configured to provide temporalgraphs corresponding to the system data logs. In an embodiment, thetemporal graph generator 1006 may be configured to provide a firsttemporal graph corresponding to a target behavior and a second temporalgraph corresponding to a set of background behaviors. In a furtherembodiment, temporal graph generator 1006 may be configured to providetemporal subgraphs corresponding to the system data logs.

The temporal graph pattern generator 1008 may be configured to generatetemporal graph patterns for each of the temporal graphs. For example,temporal graph pattern generator 1008 may provide a first temporal graphpattern for a first temporal graph and a second temporal graph patternfor a second temporal graph. In a further embodiment, the temporal graphpattern generator 1008 may generate temporal graph patterns that areT-connected graph patterns.

The pattern determiner 1010 may be configured to determine whether or not a pattern exists between the temporal graph patterns. For example, the pattern determiner 1010 may determine if a pattern exists between a first temporal graph pattern and a second temporal graph pattern. In a further embodiment, the pattern determiner 1010 may be configured to determine a non-repetitive graph pattern and/or consecutive graph pattern between the first and second temporal graph patterns. For example, the pattern determiner 1010 may determine a pattern between temporal graph patterns when each edge in a first temporal graph pattern corresponds to each edge in a second temporal graph pattern such that the node mappings between each edge are one-to-one. In a further embodiment, the pattern determiner 1010 may determine at least one of a forward growth pattern, a backward growth pattern, or an inward growth pattern, as described above. Advantageously, the pattern determiner 1010 may determine a non-repetitive pattern without the need for canonical labeling techniques.

The pattern pruner 1012 may be configured to prune the determinedpattern to provide discriminative temporal graphs. In one embodiment,the pattern pruner 1012 may prune the patterns to select only thosesub-relations with maximum frequency and/or maximum discriminativescore. In a further embodiment, the pattern pruner 1012 may prunetemporal sub-relations using subgraph pruning and/or supergraph pruning,as described above. In yet a further embodiment, the pattern pruner 1012may be configured to prune the pattern between the temporal graphpatterns by determining a set of residual graphs for each temporal graphpattern. In yet a further embodiment, the pattern pruner 1012 may beconfigured to minimize overhead from subgraph tests and minimizeoverhead from residual graph set equivalence tests.

The behavior query generator 1014 may be configured to generate behavior queries based on the discriminative temporal graphs. In an embodiment, behavior query generator 1014 may select patterns with the highest discriminative score as behavior queries to search target behavior activities from a repository of system data logs to determine if there are abnormal and/or suspicious activities occurring on a computer system. The behavior queries can then be stored on storage device 1016.
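One way a behavior query might be applied to a repository of system data logs is sketched below. The search procedure is not specified in this passage, so the sketch assumes a plain backtracking search that looks for the query edges as an order-preserving subsequence of the logged edges under a one-to-one node mapping. The function name `find_match` and the edge-list representation are hypothetical, and node labels are again omitted for brevity.

```python
from typing import Dict, Hashable, List, Optional, Tuple

# An edge is (source, destination); both lists are ordered by timestamp.
Edge = Tuple[Hashable, Hashable]
Mapping = Dict[Hashable, Hashable]


def find_match(query: List[Edge], log: List[Edge]) -> Optional[Mapping]:
    """Return a query-node -> log-node mapping if the query occurs in the log
    with its edge order preserved, or None if it does not occur."""

    def extend(qi: int, start: int, fwd: Mapping, bwd: Mapping) -> Optional[Mapping]:
        if qi == len(query):
            return dict(fwd)  # every query edge has been placed
        qu, qv = query[qi]
        for li in range(start, len(log)):
            lu, lv = log[li]
            new_fwd, new_bwd = dict(fwd), dict(bwd)
            consistent = True
            for a, b in ((qu, lu), (qv, lv)):
                # Reject the log edge if it would break the one-to-one mapping.
                if new_fwd.setdefault(a, b) != b or new_bwd.setdefault(b, a) != a:
                    consistent = False
                    break
            if consistent:
                # Later query edges may only use later log edges.
                found = extend(qi + 1, li + 1, new_fwd, new_bwd)
                if found is not None:
                    return found
        return None

    return extend(0, 0, {}, {})


query = [("a", "b"), ("b", "c")]
hit = find_match(query, [("x", "y"), ("z", "w"), ("y", "q")])
miss = find_match(query, [("y", "q"), ("x", "y")])
```

The second log contains the same two edges in the opposite temporal order, so the query does not match it. The backtracking search is exponential in the worst case and is shown only to make the matching condition concrete.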

It should be noted that while the above configuration is illustratively depicted, it is contemplated that other sorts of configurations may also be employed according to the present principles. These and other variations between configurations are readily determined by one of ordinary skill in the art given the teachings of the present principles provided herein, while maintaining the present principles.

In some embodiments, monitoring device 1002, system data log database 1004, temporal graph generator 1006, temporal graph pattern generator 1008, pattern determiner 1010, pattern pruner 1012, behavior query generator 1014 and/or storage device 1016 of system 1000 may be a virtual appliance (e.g., computing device, node, server, etc.), and may be directly connected to a network or located remotely for controlling via any type of transmission medium (e.g., Internet, intranet, internet of things, etc.). In some embodiments, monitoring device 1002, system data log database 1004, temporal graph generator 1006, temporal graph pattern generator 1008, pattern determiner 1010, pattern pruner 1012, behavior query generator 1014 and/or storage device 1016 may be a hardware device, and may be attached to a network or built into a network according to the present principles.

In the embodiment shown in FIG. 10, the elements thereof are interconnected by a bus 1001. However, in other embodiments, other types of connections can also be used. Moreover, in one embodiment, at least one of the elements of system 1000 is processor-based. Further, while one or more elements may be shown as separate elements, in other embodiments, these elements can be combined as one element. The converse is also applicable, where while one or more elements may be part of another element, in other embodiments, the one or more elements may be implemented as standalone elements. These and other variations of the elements of system 1000 are readily determined by one of ordinary skill in the art, given the teachings of the present principles provided herein.

The foregoing is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that those skilled in the art may implement various modifications without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.

What is claimed is:
 1. A computer implemented method for constructing behavior queries in temporal graphs using discriminative sub-trace mining, comprising: generating system data logs to provide temporal graphs, wherein the temporal graphs include at least a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors; generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern; pruning the pattern between the temporal graph patterns to provide at least one discriminative temporal graph; and generating behavior queries based on the at least one discriminative temporal graph.
 2. The computer implemented method according to claim 1, wherein the pattern is determined when each edge in the first temporal graph pattern corresponds to each edge in the second temporal graph pattern such that node mappings between each edge are one-to-one.
 3. The computer implemented method according to claim 1, wherein the pattern includes temporal graph patterns that are identical in linear time.
 4. The computer implemented method according to claim 1, wherein the system data logs are generated in a closed environment such that the at least one target behavior is performed independently from the set of background behaviors.
 5. The computer implemented method according to claim 1, wherein the pattern includes a consecutive growth pattern.
 6. The computer implemented method according to claim 5, wherein the consecutive growth pattern includes at least one of a forward growth pattern, a backward growth pattern, and an inward growth pattern.
 7. The computer implemented method according to claim 1, wherein the temporal graphs are T-connected temporal graphs.
 8. The computer implemented method according to claim 1, wherein pruning includes at least one of subgraph pruning and supergraph pruning.
 9. The computer implemented method according to claim 1, further comprising minimizing overhead from at least one of subgraph tests and residual graph set equivalence tests.
 10. A system for constructing behavior queries in temporal graphs using discriminative sub-trace mining, comprising: a monitoring device to generate system data logs to provide temporal graphs, wherein the temporal graphs include at least a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors; a temporal graph pattern generator to generate temporal graph patterns for each of the first and second temporal graphs; a pattern determiner to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern; a pattern pruner comprising a processor, coupled to a bus, to prune the pattern between the temporal graph patterns to provide at least one discriminative temporal graph; and a behavior query generator, coupled to the bus, to generate behavior queries based on the at least one discriminative temporal graph.
 11. The system according to claim 10, wherein the pattern is determined when each edge in the first temporal graph pattern corresponds to each edge in the second temporal graph pattern such that node mappings between each edge are one-to-one.
 12. The system according to claim 10, wherein the monitoring device is further configured to generate the system data logs in a closed environment such that the at least one target behavior is performed independently from the set of background behaviors.
 13. The system according to claim 10, wherein the pattern includes a consecutive growth pattern.
 14. The system according to claim 13, wherein the consecutive growth pattern includes at least one of a forward growth pattern, a backward growth pattern, and an inward growth pattern.
 15. The system according to claim 11, wherein the pattern pruner is further configured to prune using at least one of subgraph pruning and supergraph pruning.
 16. A computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therein for a method for constructing behavior queries in temporal graphs using discriminative sub-trace mining, the method comprising: generating system data logs to provide temporal graphs, wherein the temporal graphs include at least a first temporal graph corresponding to a target behavior and a second temporal graph corresponding to a set of background behaviors; generating temporal graph patterns for each of the first and second temporal graphs to determine whether a pattern exists between a first temporal graph pattern and a second temporal graph pattern, wherein the pattern between the temporal graph patterns is a non-repetitive graph pattern; pruning the pattern between the temporal graph patterns to provide at least one discriminative temporal graph; and generating behavior queries based on the at least one discriminative temporal graph.
 17. The computer program product of claim 16, wherein the pattern is determined when each edge in the first temporal graph pattern corresponds to each edge in the second temporal graph pattern such that node mappings between each edge are one-to-one.
 18. The computer program product of claim 16, wherein the system data logs are generated in a closed environment such that the at least one target behavior is performed independently from the set of background behaviors.
 19. The computer program product of claim 16, wherein pruning includes at least one of subgraph pruning and supergraph pruning.
 20. The computer program product of claim 19, further comprising minimizing overhead from at least one of subgraph tests and residual graph set equivalence tests.